Statistical Analysis of Airport Network of China * 
Through the study of the airport network of China (ANC), composed of 128 airports (nodes) and
1165 flights (edges), we show that the topological structure of ANC exhibits two characteristics of
small worlds: a short average path length (2.067) and a high degree of clustering (0.733). The
cumulative degree distributions of both directed and undirected ANC obey two-regime power laws with
different exponents, i.e., the so-called Double Pareto Law. In-degrees and out-degrees of each
airport are positively correlated, whereas the undirected degrees of adjacent airports show
significant linear anticorrelations. Both the weekly and daily cumulative distributions of flight
weights (frequencies) of ANC are shown to have power-law tails. Besides, the weight of any given
flight is proportional to the degrees of both airports at the two ends of that flight. It is also
shown that the diameter of each sub-cluster (consisting of an airport and all those airports to
which it is linked) decreases linearly with its density of connectivity. The efficiencies of ANC
and of its sub-clusters are measured through a simple definition, under which the efficiency of a
sub-cluster increases with its density of connectivity. ANC is found to have an efficiency of 0.484.

PACS numbers: 89.40.Dd; 89.75.Da; 89.75.-k 



The ER model [?] of random graphs, introduced by
Erdős and Rényi, has attracted much attention from both
mathematicians and physicists [?], and has since
shaped our understanding of networks for decades.
Growing interest in whether randomness dominates
real-world networks, however, was eventually prompted
by recent advances along two main lines. One
part of this work relates to "small worlds", originally
described as "six degrees of separation" [7], which
states that humans are connected through a short,
limited chain of acquaintances. The concept was
successfully employed by Watts and Strogatz [?] in exploring
the dynamics of a great variety of networks between order
and randomness, e.g., the actor and actress networks [10],
the chemical reaction networks [11], the rumor spreading
networks [?], and the food webs [?]. Another
parallel achievement was made by the research team of
Barabási [?], which led to the finding of a
class of networks with scale-free degree distributions, for
example, the Internet [13], the networks of co-authorship in
natural sciences [?], the web of sexual contacts [?], and
the graph of human language [?].

Composed of a number of airports and flights, air net- 
works are simply normal examples of transportation sys- 
tems among ubiquitous networks in nature. Nevertheless, 
they appear extraordinary and unique due to the follow- 
ing features: (a) quite limited system sizes, from a few 
hundred to a few thousand at most; (b) relatively sta- 
tionary structures with respect to both time and space; 
(c) bi-directional, weighted links (flights) with slightly 
fluctuating frequency. 

This paper presents an investigation of the airport
network of China (ANC). We demonstrate that, on the one
hand, ANC embodies some features of small worlds and of
scale-free networks. On the other hand, ANC exhibits
further features that reflect its own topological
uniqueness. The text is organized as follows. Section I
presents the results on degree distributions and degree 
correlations of ANC. Section II gives the flight weight 
distributions and the weight-degree correlation of ANC. 
Section III analyzes the clustering coefficients of ANC. In 
Section IV we calculate the diameter of ANC and discuss 
the efficiency of ANC by proposing a simple definition for 
it. Conclusions and discussions are given in the last part, 
section V. 
I. DEGREE DISTRIBUTIONS AND DEGREE
CORRELATIONS



ANC consists of N = 128 [20] airports (nodes) and
1165 flights (edges) that connect most major cities in
China. The topology of ANC can be represented by a
128 x 128 x 7 connectivity matrix C whose entry $C_{ijt}$
is 1 if there is a link pointing from node i to node j
on the t-th day of a week and 0 otherwise (here and
hereafter t = 1, 2, ..., 7 labels the seven days of a
week, starting from Monday), and a 128 x 128 x 7 weight
matrix W [21] whose element is defined as

    W_{ijt} = \frac{n_{ijt}}{\sum_{t,\{i,j\}} n_{ijt}},    (1)

where $n_{ijt}$ is the number of flights $i \to j$ on the
t-th day. $W_{ijt}$ satisfies the normalization condition
$\sum_{t,\{i,j\}} W_{ijt} = 1$. In general, $C_{ijt} = C_{jit}$
and $W_{ijt} = W_{jit}$ hold only for the undirected ANC.

We employ $k_{in}(i)$ and $k_{out}(i)$ to denote the in-degree
and out-degree of a given node i in the directed ANC
over a whole week, and $k_{un}(i)$ to represent the
degree of node i in the undirected ANC over the same
week. Hence, we have

    k_{in}(i) = \sum_{j \ne i} \eta\Big( \sum_{t=1}^{7} C_{jit} - 1 \Big),    (2)

    k_{out}(i) = \sum_{j \ne i} \eta\Big( \sum_{t=1}^{7} C_{ijt} - 1 \Big),    (3)

and

    k_{un}(i) = \sum_{j \ne i} \eta\Big( \sum_{t=1}^{7} (C_{ijt} + C_{jit}) - 1 \Big),    (4)

where $\eta(x)$ is a unit step function, which takes 1 for
$x \ge 0$ and 0 otherwise.

First we consider the three distributions of $k_{in}(i)$,
$k_{out}(i)$, and $k_{un}(i)$, respectively. Here the cumulative
distribution, widely used in economics and well known as the
Pareto Law [22], is adopted to reduce the statistical
errors arising from the limited system size. The cumulative
form, $P(k_{in}(i) > k)$ ($P(k_{out}(i) > k)$, or $P(k_{un}(i) > k)$),
gives the probability that a given airport i has an
in-degree (out-degree or undirected degree) larger than k.

Fig. 1A presents the behavior of the three distributions.
Remarkably, all three distributions follow nearly the
same two-regime power law with two different
exponents, known as the Double Pareto Law [?], with a
turning point at degree value $k_c \approx 26$, which is well
described by the following expression,

    P(K > k) \sim \begin{cases}
        k^{-\gamma_1}, & \text{for } k \le k_c \\
        k^{-\gamma_2}, & \text{for } k > k_c
    \end{cases}    (5)

where $\gamma_1$ and $\gamma_2$ are the respective exponents of
the two separate power laws. By fitting, the exponent
pairs ($\gamma_1$, $\gamma_2$) of the three distributions in Fig. 1A are
(0.428, 4.161), (0.416, 4.453), and (0.45, 4.535). By
simple algebra, the original distribution of $k_{in}(i)$
($k_{out}(i)$, or $k_{un}(i)$) can be written as

    P(k) = -\frac{dP(K > k)}{dk} \sim \begin{cases}
        k^{-(\gamma_1 + 1)}, & \text{for } k \le k_c \\
        k^{-(\gamma_2 + 1)}, & \text{for } k > k_c
    \end{cases}    (6)

where k stands for any of the three degrees above.
Correspondingly, the mean values of $k_{in}(i)$, $k_{out}(i)$, and
$k_{un}(i)$ are 18.931, 17.156 and 18.203. That is, each
airport is connected, on average, to around 18 other
airports.

The undirected degree of a certain airport i on the t-th
day of a week is given by

    k_{un}^{t}(i) = \sum_{j \ne i} \eta\big( C_{ijt} + C_{jit} - 1 \big).    (7)
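The degree definitions and the cumulative distribution used in this section can be illustrated with a minimal Python sketch. It assumes the reading that a neighbour counts once towards a weekly degree if at least one flight links the two airports on any day of the week; the names `step`, `weekly_degrees`, `cumulative_distribution` and the toy tensor `TOY_C` are hypothetical and do not come from the ANC data.

```python
from typing import List, Tuple

Tensor = List[List[List[int]]]  # C[t][i][j]: 1 if a flight i -> j runs on day t


def step(x: int) -> int:
    """Unit step function: 1 for x >= 0, 0 otherwise."""
    return 1 if x >= 0 else 0


def weekly_degrees(C: Tensor) -> Tuple[List[int], List[int], List[int]]:
    """Return (k_in, k_out, k_un); a neighbour counts once however many days it is linked."""
    days, n = range(len(C)), len(C[0])
    k_in, k_out, k_un = [], [], []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        k_in.append(sum(step(sum(C[t][j][i] for t in days) - 1) for j in others))
        k_out.append(sum(step(sum(C[t][i][j] for t in days) - 1) for j in others))
        k_un.append(
            sum(step(sum(C[t][i][j] + C[t][j][i] for t in days) - 1) for j in others)
        )
    return k_in, k_out, k_un


def cumulative_distribution(degrees: List[int]) -> List[Tuple[int, float]]:
    """Return (k, P(K > k)) for every distinct degree k, in increasing order of k."""
    n = len(degrees)
    return [(k, sum(d > k for d in degrees) / n) for k in sorted(set(degrees))]


# Invented toy data: 3 airports, 2 days (not taken from the ANC data set).
TOY_C = [
    [[0, 1, 1], [1, 0, 0], [0, 0, 0]],  # day 1: 0->1, 0->2, 1->0
    [[0, 1, 0], [0, 0, 0], [1, 0, 0]],  # day 2: 0->1, 2->0
]
```

Plotting the pairs returned by `cumulative_distribution` on log-log axes would give the kind of curve shown in Fig. 1A; the two-regime fit itself is not attempted here.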



The cumulative distributions of $k_{un}^{t}(i)$, with t = 1, 2, ..., 7,
shown in Fig. 1B, reflect the daily evolution of
the topology of the undirected ANC within a week. It is
evident from Fig. 1B that the distributions for the days from
Monday to Saturday nearly coincide with one another, on
the same Double Pareto Law. The distribution for Sunday,
however, deviates visibly from the shared curve and
obeys a different law. Checking the original data shows
that the discrepancy is mainly caused by
the fact that some flights are not available on Sundays.
Exponent pairs and average undirected degrees of the
undirected ANC for each day of one week are listed in
Table I. As we can see, the values of $\gamma_1$ and $\gamma_2$ in the
table are in general (except on Sundays) slightly larger
than their counterparts for the undirected ANC over a whole
week. The average degrees of each day, around 14 (12 on
Sundays), are much smaller than 18, the weekly
value. This is understandable because many flights are
only available on certain days of a week.

We also check an important feature of ANC, the degree
correlations. First we come to the correlation between
in-degrees and out-degrees, simply called in-out degree
correlation. This is quite natural for airport networks
because each airport should generally maintain the
balance of its traffic flow. Normally, for each airport, the
higher its in-degree, the higher its out-degree. We plot
$k_{in}(i)$ versus $k_{out}(i)$ (i = 1, 2, ..., 128) in Fig. 2.
Fitting the data gives the following expression,

    k_{out}(i) \ldots    (8)

Evidently, the in-out degree correlation is very strong.

Another possible correlation exists between the degrees
of adjacent airports, named degree-degree correlation.
The degree-degree correlation tells us that the degrees are
not independent but correlate with those of their
neighbors. It can be demonstrated by calculating the mean
degree of the neighbors of a given airport as a function
of the degree of that airport. Fig. 3 presents our analysis
of degree-degree correlation in the undirected ANC. As
shown, the degrees of adjacent airports have significant
anticorrelations, on which basis ANC appears to be
disassortative [24]. But the anticorrelation found in ANC
is almost linear, different from that found in Ref. [?],
which is a power law with an exponent of about -0.5.
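The degree-degree analysis described above (mean neighbor degree as a function of node degree) can be sketched as follows. This is an illustrative sketch only: the function name `mean_neighbor_degree_by_degree` and the star-shaped toy graph are assumptions made for the example, not the procedure or data behind Fig. 3.

```python
from collections import defaultdict
from typing import Dict, Set


def mean_neighbor_degree_by_degree(adj: Dict[int, Set[int]]) -> Dict[int, float]:
    """Map each degree k to the mean neighbor degree averaged over all nodes of degree k."""
    degree = {v: len(neighbors) for v, neighbors in adj.items()}
    per_degree = defaultdict(list)
    for v, neighbors in adj.items():
        if neighbors:  # isolated nodes have no neighbors to average over
            mean_nb = sum(degree[u] for u in neighbors) / len(neighbors)
            per_degree[degree[v]].append(mean_nb)
    return {k: sum(vals) / len(vals) for k, vals in sorted(per_degree.items())}


# Invented toy graph: a hub (node 0) linked to three otherwise unconnected airports.
STAR = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

For the star graph the low-degree nodes have a high-degree neighbor and vice versa, which is the disassortative pattern; a decreasing trend of the returned values with k is what an anticorrelation like that of Fig. 3 would look like.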



II. FLIGHT WEIGHT DISTRIBUTIONS AND 
WEIGHT-DEGREE CORRELATION 



An important feature of ANC is that some flights are
more frequent than others. The weight, or frequency,
of a certain flight hence states the extent to which
it is busy. The weight of flight $i \to j$ over a whole week is
given by

    W_{ij} = \sum_{t=1}^{7} W_{ijt}.    (9)
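The weight definitions can be made concrete with a short sketch: daily flight counts are normalized so that all weights sum to 1, and the weekly weight of a route is the sum of its daily weights. The function names and the two-airport toy counts are hypothetical, and the normalization is assumed to run over all days and all ordered airport pairs.

```python
from typing import List

Counts = List[List[List[int]]]     # counts[t][i][j]: number of flights i -> j on day t
Weights = List[List[List[float]]]  # same layout, normalized


def normalized_weights(counts: Counts) -> Weights:
    """Divide every daily count by the grand total so that all weights sum to 1."""
    total = sum(x for day in counts for row in day for x in row)
    return [[[x / total for x in row] for row in day] for day in counts]


def weekly_weights(W: Weights) -> List[List[float]]:
    """Sum the daily weights of each directed route i -> j over all days."""
    size = len(W[0])
    return [
        [sum(W[t][i][j] for t in range(len(W))) for j in range(size)]
        for i in range(size)
    ]


# Invented toy counts: 2 airports, 2 days.
TOY_COUNTS = [
    [[0, 2], [1, 0]],  # day 1: two flights 0->1, one flight 1->0
    [[0, 1], [0, 0]],  # day 2: one flight 0->1
]
```

With the toy counts, route 0->1 carries three of the four flights, so its weekly weight is 0.75 and that of 1->0 is 0.25.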



The cumulative distribution of $W_{ij}$, $P(W_{ij} > W)$, gives
the probability that a flight has a weight larger than W.
As shown in Fig. 1C, $P(W_{ij} > W)$ has a power-law tail,

    P(W_{ij} > W) \sim W^{-\gamma},    (10)

where $\gamma = 1.65$. By simple algebra, one obtains
$P(W) \sim W^{-2.65}$. Such a power-law tail indicates
that the probability of finding a very busy flight is
not negligible. The daily cumulative
distributions of $W_{ijt}$ within a week are given in Fig. 1D.
Among the seven distributions, those from Monday to
Saturday obey the same power law, while the Sunday data
reveal a steeper power law that spans a narrower
range of weights. The exponents of the flight weight
distributions of each day are also presented in Table I
and are slightly larger than 1.65.

We also ask whether there is a relation
between the weight of a given flight and the degrees of the
two airports at both ends of that flight. We simply call it
weight-degree correlation. Without loss of generality,
we propose the following ansatz for the possible existence
of such a correlation,

    W_{ij} \sim [k(i)\, k(j)]^{1/2}.    (11)

This scaling ansatz is well supported by the real
data, as shown in Fig. 4.


III. CLUSTERING COEFFICIENT

The neighborhood $\Gamma_v$ of a given airport v is a graph
which includes all nodes that have flights with v. The
clustering coefficient [?] $C(\Gamma_v)$ of the neighborhood $\Gamma_v$ of
airport v characterizes the extent to which airports in $\Gamma_v$
are connected to one another. Precisely,

    C(\Gamma_v) = \frac{E(\Gamma_v)}{\binom{m}{2}},    (12)

where $E(\Gamma_v)$ is the number of actual connections in $\Gamma_v$,
which consists of m airports, and $\binom{m}{2} = m(m-1)/2$ is the
total number of all possible connections in $\Gamma_v$. The average
clustering coefficient of the entire air network is defined as

    C = \frac{1}{N} \sum_{v} C(\Gamma_v),    (13)

where N is the number of airports of the whole network.
By calculation, C of the entire undirected ANC for a
whole week is 0.733, in stark contrast with the low
density of connectivity, $\langle k \rangle / N$, 0.143. The C of the daily
undirected ANC given in Table I clusters around 0.600, the
value for Sunday being slightly lower.
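The clustering coefficient defined in this section (actual links among the neighbors of an airport divided by the m(m-1)/2 possible ones, then averaged over all airports) can be sketched as below. The function names and the toy graph are hypothetical, and assigning 0 to airports with fewer than two neighbors is a convention assumed here, since the text does not specify that case.

```python
from itertools import combinations
from typing import Dict, Set


def neighborhood_clustering(adj: Dict[int, Set[int]], v: int) -> float:
    """Fraction of possible links among the neighbors of v that actually exist."""
    neighbors = adj[v]
    m = len(neighbors)
    if m < 2:
        return 0.0  # assumed convention: no pair of neighbors, so no clustering
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return links / (m * (m - 1) / 2)


def average_clustering(adj: Dict[int, Set[int]]) -> float:
    """Mean of the neighborhood clustering coefficients over all nodes."""
    return sum(neighborhood_clustering(adj, v) for v in adj) / len(adj)


# Invented toy graph: a triangle 0-1-2 with one extra airport 3 attached to 0.
TOY_ADJ = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
```

In the toy graph, airport 0 has three neighbors with one link among them (coefficient 1/3), airports 1 and 2 each have coefficient 1, and airport 3 has coefficient 0, so the average is 7/12.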
IV. DIAMETER AND EFFICIENCY

For a connected network, the diameter D can be defined as

    D = \frac{1}{N(N-1)/2} \sum_{i<j} d_{min}(i,j),    (14)

where $d_{min}(i,j)$ represents the shortest-path length
between nodes i and j. In an air network, the diameter
D indicates the average number of flight legs a passenger
needs to take between the start and the end. For ANC,
D is around 2.067. Specifically, $d_{min}(i,j)$ in ANC only
takes three distinct values, 1, 2 and 3, with fractions
of 0.143, 0.646, and 0.211, respectively. This implies that
most trips need one or two intermediate transfers
before the final destination, and only a small fraction of
destinations can be reached directly.

The high clustering and the small diameter
indicate the small-world property of ANC. For
comparison, random graphs with the same average degree, $\langle k \rangle$, and
the same number of nodes, N, as ANC are considered.
It is readily seen that the average clustering
coefficient of such random graphs, 0.143, is much smaller
than 0.733, the weekly average clustering coefficient of ANC. The
diameter of such random graphs scales as $\ln N / \ln \langle k \rangle$,
which is 1.672, less than the counterpart of ANC.
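The average shortest-path length and the random-graph estimate quoted above can be checked with a small sketch. It uses breadth-first search on an unweighted, undirected adjacency structure and assumes the graph is connected; the function names and the three-node path used for illustration are hypothetical.

```python
import math
from collections import deque
from typing import Dict, Set


def bfs_distances(adj: Dict[int, Set[int]], source: int) -> Dict[int, int]:
    """Shortest-path lengths (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def average_path_length(adj: Dict[int, Set[int]]) -> float:
    """Mean shortest-path length over all unordered node pairs (connected graph assumed)."""
    nodes = list(adj)
    total = 0
    for idx, s in enumerate(nodes):
        dist = bfs_distances(adj, s)
        total += sum(dist[t] for t in nodes[idx + 1:])  # KeyError if graph is disconnected
    n = len(nodes)
    return total / (n * (n - 1) / 2)


def random_graph_path_length(n: int, mean_degree: float) -> float:
    """Estimate ln N / ln <k> for a random graph with n nodes and the given mean degree."""
    return math.log(n) / math.log(mean_degree)


# Invented toy graph: a path 0 - 1 - 2.
PATH = {0: {1}, 1: {0, 2}, 2: {1}}
```

With N = 128 and the weekly mean undirected degree 18.203, `random_graph_path_length` gives about 1.672, matching the value in the text. The quoted fractions of shortest paths are also consistent with the reported diameter: 0.143 x 1 + 0.646 x 2 + 0.211 x 3 is about 2.068.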

A practical aspect of ANC is its transportation
efficiency, which tells us how one can travel from one
place to another both quickly and economically. Let us
first take a look at the efficiency of sub-clusters of ANC.
A sub-cluster here is composed of a hub v, the central
node, and its neighborhood $\Gamma_v$ consisting of all airports
that have flights with the hub. The largest sub-cluster of ANC
includes 84 airports, and the smallest one, only 2. In
terms of graph theory, the sub-clusters consist of only
two kinds of structure, trees and triangles. The density
of connectivity of a sub-cluster with m nodes and $E(\Gamma_v)$
edges in the neighborhood $\Gamma_v$ of the hub is

    \rho_{dc} = \frac{2(E(\Gamma_v) + m)}{m^2 + m}.    (15)

The diameter of the sub-cluster, $D_{sc}$, can be readily
derived,

    D_{sc} = \frac{2(m^2 - E(\Gamma_v))}{m^2 + m}.    (16)

The plot of $D_{sc}$ versus $\rho_{dc}$, for all 128 sub-clusters of
ANC, is presented in Fig. 5A, which is well fitted
by a straight line, consistent with the relation
$D_{sc} = 2 - \rho_{dc}$ that follows from Eqs. (15) and (16).
The larger $\rho_{dc}$ is, the more direct
connections exist in the sub-cluster, and the smaller
the diameter. In the case of a complete graph,
the diameter is exactly 1.

We simply define the efficiency of sub-clusters of ANC as

    E_{sc} = \frac{m^2 + m}{2(m^2 - E(\Gamma_v))}.    (17)

After a simple calculation, $E_{sc}$ versus $\rho_{dc}$ is presented in
Fig. 5B. It clearly shows that the higher the density
of connectivity, the higher the efficiency of a sub-cluster.
The efficiency is 1 when the sub-cluster is fully
connected. This agrees with our intuition.
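The three sub-cluster quantities can be evaluated directly from m (the number of neighbors of the hub) and E (the number of links among those neighbors). The sketch below assumes the formulas density = 2(E + m)/(m^2 + m), diameter = 2(m^2 - E)/(m^2 + m) and efficiency = (m^2 + m)/(2(m^2 - E)); the function name `subcluster_measures` is hypothetical.

```python
from typing import Tuple


def subcluster_measures(m: int, e: int) -> Tuple[float, float, float]:
    """Return (density, diameter, efficiency) of a hub with m neighbors and e links among them."""
    pairs_times_two = m * m + m            # twice the number of node pairs among m + 1 nodes
    density = 2 * (e + m) / pairs_times_two
    diameter = 2 * (m * m - e) / pairs_times_two
    efficiency = pairs_times_two / (2 * (m * m - e))
    return density, diameter, efficiency


# A star (no links among the neighbors) versus a fully connected sub-cluster, both with m = 3.
STAR_MEASURES = subcluster_measures(3, 0)
COMPLETE_MEASURES = subcluster_measures(3, 3)
```

For the star the values are (0.5, 1.5, 0.667) and for the complete sub-cluster (1, 1, 1). In both cases the diameter equals 2 minus the density and the efficiency is the reciprocal of the diameter, which is why the plot of diameter against density falls on a straight line.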

Compared with its sub-clusters, ANC itself displays no
essential difference in structure. ANC can be viewed as
a cluster with hierarchical structure [26], composed of a
center, e.g., Beijing, the airports that have direct connections
with the center, the airports that have no direct connection
with the center but are linked to those that do, and so on. For a
connected network, such a cluster can include all nodes
in the same system. According to the real data, each
node of ANC is connected to any other within no more
than three steps. In this sense, Eq. (17) also applies to
ANC. After some algebra, we find the efficiency of ANC
is 0.484.
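As a consistency check on this value, and assuming that the efficiency of the whole network is taken as the reciprocal of its diameter (the sub-cluster efficiency of Eq. (17) is the reciprocal of the sub-cluster diameter), the reported diameter D = 2.067 gives:

```latex
E_{\mathrm{ANC}} = \frac{1}{D} = \frac{1}{2.067} \approx 0.484
```

This reading reproduces the quoted efficiency, although the text does not spell out the intermediate algebra.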

V. CONCLUSIONS AND DISCUSSIONS 

In conclusion, our analysis reveals two characteristic
small-world properties of ANC, a short average path
length and a high degree of clustering. Another
important feature of ANC, the degree distribution, however, is
strikingly different from its counterparts in both scale-free
networks and random graphs. In ANC there exist
strong, positive correlations between the in-degrees and
out-degrees of each airport, and significant anticorrelations
between the degrees of adjacent airports. The weekly and
daily weight distributions of ANC display power-law
behaviors. The existence of weight-degree correlation in
ANC shows that the weight of
a certain flight depends on the degrees of the two airports at both
ends of that flight. In particular, we suggest a rough
idea to measure the efficiency of ANC and that of its
sub-clusters.

In the previous sections we did not answer why the
structure of ANC obeys the Double Pareto Law. Here we
put forward a simple idea which can be realized through
computer simulation. Suppose one constructs a whole
airport network from the very beginning, with only a few
airports in major cities, following two simple rules.
Under the first rule, preferential attachment [13], a newly
established airport tends to connect to the hubs with more
flights, which naturally drives the airport network to
develop a structure beyond those of random graphs. The
second rule reflects the existence of different growth
rates of airports between the region of smaller airports
and that of larger ones. That is, in the early history
of airport network construction, smaller airports had
a considerable probability of growing to accommodate
more flights. Gradually, as most major airports became
established, the smaller airports were unlikely to
expand any more. Hence more small-sized airports were
established. This limited growth endows the airport
network with some features of scale-free topology. It may be more
appropriate to say that ANC has a topology
intermediate between random graphs and scale-free networks.
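A minimal sketch of the first rule only (preferential attachment) is given below, as one possible starting point for such a simulation. It is not the authors' model: the second rule, the size-dependent growth rate of airports, is deliberately left out because the text does not specify it quantitatively. The function name `grow_network` and all parameter values are assumptions chosen for illustration.

```python
import random
from typing import Dict, Set


def grow_network(n_total: int, n_seed: int = 3, links_per_new: int = 2,
                 seed: int = 0) -> Dict[int, Set[int]]:
    """Grow an undirected network in which new airports prefer well-connected ones."""
    if links_per_new > n_seed:
        raise ValueError("links_per_new must not exceed n_seed")
    rng = random.Random(seed)
    # Start from a fully connected core of n_seed major airports.
    adj = {i: {j for j in range(n_seed) if j != i} for i in range(n_seed)}
    # Each node appears in `targets` once per incident link, so a uniform draw
    # from this list selects a node with probability proportional to its degree.
    targets = [i for i in range(n_seed) for _ in range(n_seed - 1)]
    for new in range(n_seed, n_total):
        chosen: Set[int] = set()
        while len(chosen) < links_per_new:
            chosen.add(rng.choice(targets))
        adj[new] = set(chosen)
        for old in chosen:
            adj[old].add(new)
            targets.extend([old, new])
    return adj
```

Calling `grow_network(128)` yields a connected network with 128 nodes whose degree distribution could then be compared with Fig. 1A; reproducing a two-regime law would require adding some form of the second rule.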

Another issue concerns the efficiency.
The efficiency based on our definition is purely idealized
and limited to the structure of the network itself.
It is more appropriate to call it structural efficiency. In
real air transportation, the carriers (airlines)
must consider more factors in order to achieve a higher
and reasonable efficiency. That is, one needs to know
how an air network can satisfy the passengers' needs on
the one hand, and ensure the profits of airlines on the other.
This is an interesting topic worth
investigating.

W.L. would like to thank Alexander von Humboldt 
Stiftung for research funding and Prof. Juergen Jost of 
Max-Planck Institute for Mathematics in the Sciences for 
hosting. 
Figure Captions:

Fig. 1: Cumulative degree distributions of ANC for 
undirected degree, in-degree, and out-degree of (A) a 
whole week and (B) each day from Monday to Sunday. 
Cumulative weight distributions of ANC for (C) a whole 
week and (D) each day from Monday to Sunday. 

Fig. 2: Correlation between in-degrees and out-degrees 
of the directed ANC in a whole week. 

Fig. 3: Correlation between degrees of adjacent airports
of the undirected ANC in a whole week.

Fig. 4: Weight-degree correlation of the undirected 
ANC in a whole week. 

Fig. 5: (A) Diameter and (B) Efficiency versus density 
of connectivity for sub-clusters of the undirected ANC. 

Table Captions: 

TABLE I: Comparison of relevant variables of daily
undirected ANC (from Monday to Sunday): (1) $\gamma_1$ and
(2) $\gamma_2$ are exponents of the two power laws of the cumulative
degree distributions; (3) $\langle k \rangle$, the average degree; (4) $\gamma$,
the exponent of the flight weight distributions; (5) C, the
clustering coefficient of the whole system.